Nowadays, millions of people interact on a daily basis on online social media like Facebook and Twitter, 
where they share and discuss information about a wide variety of topics. In this paper, we focus on a specific 
online social network, Twitter, and we analyze multiple datasets each one consisting of individuals’ online 
activity before, during and after an exceptional event in terms of volume of the communications registered. 
We consider important events that occurred in different arenas that range from policy to culture or science. 
For each dataset, the users’ online activities are modeled by a multilayer network in which each layer conveys 
a different kind of interaction, specifically: retweeting, mentioning and replying. This representation allows 
us to unveil that these distinct types of interaction produce networks with different statistical properties, in 
particular concerning the degree distribution and the clustering structure. These results suggest that models 
of online activity cannot discard the information carried by this multilayer representation of the system, 
and should account for the different processes generated by the different kinds of interactions. Secondly, 
our analysis unveils the presence of statistical regularities among the different events, suggesting that the 
non-trivial topological patterns that we observe may represent universal features of the social dynamics on 
online social networks during exceptional events. 
1 Introduction 


The advent of online social platforms and their exponentially increasing usage over the last decade have made 
it possible to analyze human behavior with an unprecedented volume of data. To a certain extent, 
online interactions represent a good proxy for social interactions and, as a consequence, the possibility to track 
the activity of individuals in online social networks allows one to investigate human social dynamics [1]. 
More specifically, in recent years an increasing number of researchers have focused on individuals’ activity on 
Twitter, a popular microblogging social platform with about 302 million active users posting more 
than 500 million messages (i.e., tweets) daily in 33 languages¹. In traditional social science research the size of the 
population under investigation is very small, with increasing costs in terms of human resources and funding. 
Conversely, monitoring activity on Twitter, as well as on other online social platforms such as Facebook and Foursquare, 
to cite just some of them, dramatically reduces such costs and allows one to study a larger population sample, 
ranging from hundreds to millions of individuals [2], within the emerging framework of computational social 
science [3]. 

The analysis of Twitter revealed that online social networks exhibit many features typical of social systems, 
with strongly clustered individuals within a scale-free topology [1]. Twitter data [5] has been used to validate 
Dunbar’s theory about the theoretical cognitive limit on the number of stable social relationships [6, 7]. It 
has been shown that individuals tend to share ties within the same metropolitan region and that non-local 
ties are affected by distance, borders and language differences [8]. Many studies have been devoted to 
determining which information flows through the network and how, as well as to understanding the 
mechanisms of information spreading - e.g., as in the case of viral content - and to identifying influential 
spreaders and comprehending their role. Attention has also been given to investigating social dynamics during 
the emergence of protests, with evidence of social influence and complex contagion providing an empirical 
test of the recruitment mechanisms theorized in formal models of collective action. 

¹ https://about.twitter.com/company 

Twitter allows users to communicate through short messages, using three different actions, namely mentioning, 
replying and retweeting. While some evidence has shown that users tend to exploit the actions made available 
by the Twitter platform in different ways, such differences have not been quantified so far. In this 
work, we analyze the activities of users from a new perspective and focus our attention on how individuals 
interact during exceptional events. 

In our framework, an exceptional event is a circumstance not likely in everyday news, limited to a short amount 
of time - typically ranging from hours to a few days - that causes an exceptional volume of tweets, allowing 
one to perform a significant statistical analysis of social dynamics. It is worth mentioning that fluctuations in the 
number of tweets, mentions, retweets and replies among users may vary from tens up to thousands in a few 
minutes, depending on the event. A typical example of an exceptional event is provided by the discovery of the 
Higgs boson in July 2012, one of the greatest events in modern physics. 

We use empirical data collected during six exceptional events of different types to shed light on individual 
dynamics in the online social network. We use social network analysis to quantify the differences between 
mentioning, replying and retweeting in Twitter and, intriguingly, our findings reveal universal features of such 
activities during exceptional events. 


2 Material & Methods 

2.1 Material 

It has been recently shown that the choice of how to gather Twitter data may significantly affect the results. 
In fact, data obtained from a simple backward search tends to over-represent more central users, not offering 
an accurate picture of peripheral activity, with a more relevant bias for the network of mentions. Therefore, 
we used the streaming Application Programming Interface (API) made available by Twitter, to collect all 
messages posted on the social network satisfying a set of temporal and semantic constraints. 

We consider different exceptional events because of their importance in different subjects, from politics to 
sport. More specifically, we focus on the Cannes Film Festival in 2013² (Cannes2013), the discovery of the 
Higgs boson in 2012³ [21] (HiggsDiscovery2012), the 50th anniversary of Martin Luther King’s famous public 
speech “I have a dream” in 2013⁴ (MLKing2013), the 14th IAAF World Championships in Athletics held 
in Moscow in 2013⁵ (MoscowAthletics2013), the “People’s Climate March” - a large-scale activist event to 
advocate global action against climate change - held in New York in 2014⁶ (NYClimateMarch2014) and the 
official visit of US President Barack Obama to Israel in 2013⁷ (ObamaInIsrael2013). 

For each event, we collected tweets sent between a starting time $t_i$ and a final time $t_f$ containing at least one 
keyword or hashtag, as specified in Table 1. It is worth remarking that in a few cases we complemented a 
dataset by including tweets obtained from the search API (at most 5% of tweets with respect to the whole 
dataset). 

² https://en.wikipedia.org/wiki/2013_Cannes_Film_Festival 

³ https://en.wikipedia.org/wiki/Higgs_boson#Discovery_of_candidate_boson_at_CERN 

⁴ https://en.wikipedia.org/wiki/I_Have_a_Dream 

⁵ https://en.wikipedia.org/wiki/People's_Climate_March 

⁶ https://en.wikipedia.org/wiki/People's_Climate_March 

⁷ https://en.wikipedia.org/wiki/List_of_presidential_trips_made_by_Barack_Obama#2013 

Figure 1: Volume of tweets, in units of number of messages posted per hour, over time for the six exceptional 
events considered in our study. 

2.2 Methods 

To understand the dynamics of Twitter user interactions during these exceptional events, we reconstruct, for 
each event, a network connecting users on the basis of the retweets, mentions and replies they have been 
the subject or object of. In the literature on Twitter data, the network that is usually built is based on 
the follower-followee relationships between users. However, this kind of network only captures users’ 
declared relations and it does not provide a good proxy for the actual interactions between them. Users, 
in fact, usually follow hundreds of accounts whose tweets appear in their news feed, even if there is no real 
interaction with the majority of those individuals. Therefore, to capture the social structure emerging from 
these interactions we build instead a network based on the exchanges between users, which can be deduced 
from the tweets that they produce. In particular, there are three kinds of interactions that can take place on 
Twitter and that we will focus on: 

• A user can retweet (RT) another user’s tweet. This means that the user is endorsing a piece of information 
shared by the other user, and is rebroadcasting it to her/his own followers. 

• A user can reply (RP) to another user’s tweet. This represents an exchange from a user to another as a 
reaction to the information contained in a user’s tweet. 

• A user can mention (MT) another user in a tweet. This represents an explicit share of a piece of 
information with the mentioned user. 

A fourth kind of possible interaction is to favourite a user’s tweet, which represents a simple endorsement of the 
information contained in the tweet, without rebroadcasting. However, we do not have this kind of information 
for our datasets and therefore we do not consider this kind of interaction. 

Dataset              Starting date              Ending date                Keywords 
Cannes2013           06 May 2013 05:23:49 GMT   03 Jun 2013 03:48:26 GMT   Cannes film festival, Cannes, canneslive, #cannes2013, #festivalcannes, #palmdor 
HiggsDiscovery2012   30 Jun 2012 21:11:19 GMT   10 Jul 2012 20:59:56 GMT   lhc, cern, boson, higgs 
MLKing2013           25 Aug 2013 13:41:36 GMT   02 Sep 2013 08:16:21 GMT   Martin Luther King, #ihaveadream 
MoscowAthletics2013  05 Aug 2013 09:25:46 GMT   19 Aug 2013 12:35:21 GMT   mos2013com, moscow2013, mosca2013, moscu2013, ^athletics 
NYClimateMarch2014   18 Sep 2014 22:46:19 GMT   22 Sep 2014 04:56:25 GMT   peopleclimatemarch, peoplesclimate, marciaxilclima, climate2014 
ObamaInIsrael2013    19 Mar 2013 15:56:29 GMT   03 Apr 2013 21:24:34 GMT   obama, israel, palestina, peace 

Table 1: Information about events used in this work. Note that starting and ending dates reported here 
consider only tweets where users perform a social action, i.e. tweets without mentions, replies or retweets are 
not considered. 

As just discussed, each kind of activity on Twitter (retweet, reply, and mention) represents a particular 
kind of interaction between two users. Therefore, an appropriate framework to capture the overall structure 
of these interactions without loss of information about the different types is the framework of multilayer 
networks [22, 23, 24, 25, 26, 27]. More specifically, in the case under investigation the most appropriate 
model is given by edge-colored graphs, particular multilayer networks where a color is assigned to each type of 
relationship - i.e., to the edges - among individuals, defining as many layers as the number of colors. We refer 
to [28] and [29] for thorough reviews about multilayer networks. 

Here, for each event, we build a multilayer network composed of L = 3 layers {RT, RP, MT}, corresponding 
to the three actions that users can perform on Twitter, and N nodes, where N is the number of Twitter users 
interacting in the context of the given event. A directed edge from user i to user j on the RT layer is 
assigned if i retweeted j. Similarly, an edge exists on the RP layer if user i replied to user j, and on the MT layer if i 
mentioned j. An illustrative example is shown in Figure 2. 
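
The construction described above can be sketched in a few lines of Python. This is only a minimal illustration, not the pipeline used in this work: the input format (already-parsed `(source, target, layer)` triples), the function names, and the choice to store each directed pair only once per layer (unweighted edges) are assumptions made for the example.

```python
LAYERS = ("RT", "RP", "MT")


def build_multilayer(interactions):
    """Return {layer: set of directed (source, target) edges}.

    `interactions` is an iterable of (source_user, target_user, layer)
    triples; this pre-parsed input format is hypothetical.
    """
    layers = {layer: set() for layer in LAYERS}
    for source, target, layer in interactions:
        if layer not in layers:
            raise ValueError(f"unknown layer: {layer!r}")
        # Assumption: repeated interactions collapse into a single edge.
        layers[layer].add((source, target))
    return layers


def active_nodes(edges):
    """Users that are the subject or the object of at least one edge."""
    return {user for edge in edges for user in edge}


# Toy data with made-up user names.
sample = [
    ("alice", "bob", "RT"),
    ("alice", "bob", "RT"),  # repeated interaction, stored once
    ("carol", "bob", "RT"),
    ("bob", "alice", "RP"),
    ("carol", "alice", "MT"),
]
network = build_multilayer(sample)
node_counts = {layer: len(active_nodes(edges)) for layer, edges in network.items()}
edge_counts = {layer: len(edges) for layer, edges in network.items()}
```

Keeping one edge set per layer makes the per-layer counts of active nodes and edges direct to compute, and the same sets can be reused for pairwise comparisons between layers.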

Details about the number of nodes and edges characterizing each event are reported in Table 2. We can observe 
that the number of nodes and edges can vary considerably across events and across layers, but for each event 
and each interaction type the size of the corresponding network is sufficient to allow a statistically significant 
analysis of the data. 


3 Results 


In the following we present an analysis of the networks introduced in the previous section, which is aimed at 
exploring two different but complementary questions. 

First, we want to know whether, within a single event, the three kinds of interactions produce different network 
topologies. To this aim, we consider basic multilayer and single-layer network descriptors relevant to 
characterize social relationships, and we study how they vary when considering different layers. 


Event                Aggregate N  Aggregate E  RT N       RT E       RP N     RP E     MT N     MT E 
Cannes2013           514,328      700,492      337,089    490,268    85,414   82,952   91,825   127,272 
HiggsDiscovery2012   747,659      817,877      434,687    542,808    167,385  122,761  145,587  152,308 
MLKing2013           346,069      339,143      286,227    288,543    24,664   18,157   35,178   32,443 
MoscowAthletics2013  103,319      144,591      73,377     102,842    11,983   12,768   17,959   28,981 
NYClimateMarch2014   115,284      239,935      94,300     213,158    7,900    8,038    13,084   18,739 
ObamaInIsrael2013    2,641,052    2,926,777    1,443,929  1,807,160  737,353  586,074  459,770  533,543 

Table 2: Number of nodes (N) and edges (E) of the network corresponding to each event considered in this study. The 
Aggregate columns report the total number of nodes and edges, corresponding to a network in which information 
is aggregated. The RT, RP and MT columns report the number of active nodes and edges per layer. A node is 
considered active on a given layer if the corresponding user is the subject or the object of the corresponding 
kind of interaction. 
Figure 2: Illustrative example of a multilayer network representing the different interactions between Twitter 
users in the context of an exceptional event. Different colors are assigned to different actions. 


Secondly, we want to unveil whether different exceptional events present any common pattern regarding user 
interactions. As shown in Figure 1, the temporal patterns of the different events considered in our study present 
highly heterogeneous profiles. Some events are, in fact, limited to one day or only to a few hours, whereas 
others span a week or more, and the profile of tweet volume varies accordingly. However, despite these 
differences, do the user interactions that take place during these events present any common feature? 

3.1 Edge overlap across layers 

To understand whether the different kinds of interaction produce similar networks, we analyze whether users 
interact with each other in the same way regardless of the type of activity (retweet, reply or mention). This 
information can be obtained by calculating the edge overlap between each pair of layers. However, when the number of 
edges is very heterogeneous across layers, a more suitable descriptor of edge overlap is given by 

$$o_{\alpha\beta} = \frac{|E_\alpha \cap E_\beta|}{\min\{|E_\alpha|, |E_\beta|\}}, \tag{1}$$

where $E_\alpha$ ($E_\beta$) is the set of edges belonging to layer $\alpha$ ($\beta$) and $|\cdot|$ indicates the cardinality of the set. This 
measure quantifies the proportion of pair-wise interactions - represented by the edges - that are common to 
two different layers. Because, as shown in Table 2, the number of edges can vary largely across the different layers, 
the normalization is given by the cardinality of the smallest set of edges, to avoid biases resulting from the size 
difference. The results are reported in Figure 3. Each value is obtained by averaging over the different events. 
The standard deviations are not shown in the figure for the sake of clarity, but are reported in Table 3. We 
see that, for every pair of layers $(\alpha, \beta)$, $o_{\alpha\beta} \ll 1$. This result indicates that different layers contain different 
pairwise interactions, i.e. the users that we retweet are not necessarily the same that we mention or reply 
to, for example. This result suggests that considering the different activities separately might be very relevant 
in order to understand human interaction dynamics on Twitter. 
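
As a hedged illustration of the normalized edge overlap described above, the following Python sketch computes the fraction of shared directed edges between two layers, normalized by the size of the smaller edge set. The function name and the toy edge sets are assumptions introduced for this example.

```python
def edge_overlap(edges_a, edges_b):
    """Shared directed edges divided by the size of the smaller edge set."""
    smaller = min(len(edges_a), len(edges_b))
    if smaller == 0:
        raise ValueError("both layers must contain at least one edge")
    return len(edges_a & edges_b) / smaller


# Toy layers with made-up users: only the pair (a, b) appears in both.
retweets = {("a", "b"), ("c", "b"), ("d", "b"), ("e", "a")}
replies = {("a", "b"), ("b", "a")}

overlap = edge_overlap(retweets, replies)  # 1 shared edge / min(4, 2)
```

Note that the edges are directed, so `("a", "b")` and `("b", "a")` count as different interactions. Normalizing by the smaller layer means that a value of 1 is reached whenever the smaller layer is entirely contained in the larger one.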


Layer pair   Edge overlap   Degree-degree correlation 
MT-RP        0.05 ± 0.04    0.50 ± 0.12 
MT-RT        0.06 ± 0.03    0.33 ± 0.08 
RP-RT        0.08 ± 0.04    0.35 ± 0.10 

Table 3: Average and standard deviation across the different events of the edge overlap and of the degree-degree 
correlation, for each layer pair. 
Figure 3: Heat map representing the edge overlap between pairs of layers, averaged over the different events. 

3.2 Degree-degree correlations across layers 

In this section, we study the degree of users, the most widely studied descriptor of the structure 
of a network. We focus in particular on the in-degree, which quantifies the number of users who interacted 
with user i on layer $\alpha$ ($\alpha$ = RT, RP, MT). This is the simplest measure of the importance of the user in 
the network. 

First, we explore whether users have the same connectivity on the different layers, i.e. whether users consistently 
have the same degree of importance on all the layers. To this aim, we compute Spearman’s rank 
correlation coefficient between the in-degree of users on one layer and their in-degree on a different layer, for 
each pair of layers. The results, averaged across the different events, are reported in Figure 4, with statistical 
details reported in Table 3. The value of two degree-degree correlations out of three is about 0.35, and the 
third - and highest - correlation is 0.5. This means that users tend to have different in-degree values on the 
different layers, i.e. a highly retweeted user is not necessarily mentioned or replied to by as many users. 
This result represents a second important indicator that the different types of interaction produce different 
networks and should be considered separately in realistic modeling of individual dynamics. 
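
The inter-layer degree-degree correlation can be sketched as follows. This is an illustrative implementation under stated assumptions, not the procedure used in this work: it uses only the Python standard library, computes Spearman's coefficient as the Pearson correlation of tie-averaged ranks, and compares all users appearing in either layer, assigning in-degree zero to users absent from a layer (the text does not specify which users enter the comparison).

```python
from collections import Counter


def in_degrees(edges):
    """Number of distinct users pointing to each target user."""
    return Counter(target for _source, target in edges)


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant in-degrees")
    return cov / (var_x * var_y) ** 0.5


def spearman_in_degree(edges_a, edges_b):
    deg_a, deg_b = in_degrees(edges_a), in_degrees(edges_b)
    # Assumption: every user seen in either layer is included.
    users = sorted({user for edge in edges_a | edges_b for user in edge})
    x = [deg_a[user] for user in users]  # Counter returns 0 if absent
    y = [deg_b[user] for user in users]
    return pearson(average_ranks(x), average_ranks(y))


# Toy layers with made-up users.
retweets = {("a", "b"), ("c", "b"), ("d", "b"), ("a", "c")}
replies = {("a", "b"), ("b", "c"), ("d", "c"), ("a", "d")}
rho = spearman_in_degree(retweets, replies)
```

Restricting the comparison to users active on both layers is an equally plausible choice and would generally give a different value, so the inclusion rule should be stated whenever such correlations are reported.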


3.3 Degree distribution per layer 

Building on the result discussed in the previous section, we also explore, for each event, the distribution of the 
in-degree on the different layers, separately. Intriguingly, for each layer, we find that the empirical distributions 
corresponding to all the exceptional events have a very similar shape, as shown in Figure 5. This result suggests 
that individuals’ communications on Twitter present some universal characteristics across very different types 
of events. 

Figure 4: Heat map representing the average degree-degree correlation between layer pairs. 

Figure 5: Distribution of the in-degree for each event considered in this study (encoded by points with different 
shape and color) and each layer: retweets (left), mentions (center), and replies (right). 

Figure 6: Notched box plots showing the value of the scaling exponent of the in-degree distribution for each 
layer. Each box aggregates the values corresponding to the different events considered. Notched box plots 
present a contraction around the median, whose height is statistically important: if the notches of two boxes 
do not overlap, this offers evidence of a statistically significant difference between the two medians. This is the 
case here, meaning that the median scaling exponent of the in-degree distribution of each of the three layers is 
different from the exponent characterizing the in-degree distribution of the other layers. 

The in-degree, shown in Figure 5, exhibits a power-law distribution over about three orders of magnitude. To 
validate our observation, we fit a power law to each distribution following a methodology similar to the one 
introduced in [32]. Since the in-degree is a discrete variable, we estimate the scaling exponent of a 
discrete power law for each empirical distribution. The goodness of fit is estimated by using the chi-square 
test [33]. We find that the null hypothesis that the data are described by a discrete power law cannot be rejected 
for any of the empirical distributions at a confidence level of 99%. We have tested other hypotheses, by considering 
other candidate distributions such as lognormal, exponential, Gumbel’s extreme value, and Poisson. In the cases 
where the null hypothesis is likewise not rejected at the same confidence level, we used the Akaike information criterion 
(AIC) [34, 35] to select the best model. It is worth remarking that, in all cases, we find that the power law 
provides the best description of the data. 
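
To make the estimation of a discrete scaling exponent concrete, the sketch below fits a discrete power law $p(k) \propto k^{-\gamma}$ for $k \geq k_{\min}$ by maximizing the log-likelihood. It is only an illustration under stated assumptions and not the exact procedure of [32]: the normalization uses a truncated-sum approximation of the Hurwitz zeta function, the maximization uses a golden-section search on a fixed interval, `k_min` is fixed by hand, and the data are synthetic.

```python
import math


def hurwitz_zeta(s, q, terms=1000):
    """Approximate sum_{n>=0} (n + q)^(-s) for s > 1 (Euler-Maclaurin tail)."""
    head = sum((n + q) ** -s for n in range(terms))
    x = terms + q
    tail = x ** (1 - s) / (s - 1) + 0.5 * x ** -s + s * x ** (-s - 1) / 12
    return head + tail


def golden_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def fit_discrete_power_law(degree_counts, k_min=1):
    """Maximum-likelihood exponent from a {degree: count} histogram."""
    kept = {k: c for k, c in degree_counts.items() if k >= k_min and c > 0}
    n = sum(kept.values())
    sum_log = sum(c * math.log(k) for k, c in kept.items())

    def log_likelihood(gamma):
        return -n * math.log(hurwitz_zeta(gamma, k_min)) - gamma * sum_log

    # Assumption: the exponent lies in (1.05, 6.0).
    return golden_max(log_likelihood, 1.05, 6.0)


# Synthetic histogram whose counts follow k^(-2.5), rounded to integers.
synthetic = {k: round(1e6 * k ** -2.5) for k in range(1, 1001)}
gamma_hat = fit_discrete_power_law(synthetic)
```

On the synthetic histogram the estimate should fall close to the exponent used to generate it. A complete analysis would also select `k_min` from the data and assess the goodness of fit, which this sketch leaves out.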

Power-law distributions of the degree have been found in a large variety of empirical social networks [36]. 
Here, the main finding is that each kind of interaction presents a different scaling exponent. 
To show this, in Figure 6 we report three notched box plots, each corresponding to a different layer and 
including the information about the different events. Notched box plots present a contraction around the 
median, whose height is statistically important: if the notches of two boxes do not overlap, this offers evidence 
of a statistically significant difference between the two medians. This is indeed the case in Figure 6, meaning 
that the median scaling exponent of the in-degree distribution of each of the three layers is different from the 
exponent characterizing the in-degree distribution of the other layers. The fact that the in-degree distributions 
corresponding to the different types of interaction are characterized by different scaling exponents indicates 
that the dynamics of each type of interaction in Twitter should be modeled as a distinct process, and that 
existing models of Twitter activity that do not take into account this fact should be carefully rethought. 
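
The notch comparison can be illustrated with a short Python sketch. It assumes a widely used convention in which the notch spans the median plus or minus 1.57 times the interquartile range divided by the square root of the sample size; the text does not state which convention was used for the figures, and the exponent values below are made up for the example.

```python
import math
import statistics


def notch_interval(values, factor=1.57):
    """Return (lower, upper) notch limits around the median."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = factor * (q3 - q1) / math.sqrt(len(values))
    median = statistics.median(values)
    return median - half_width, median + half_width


def notches_overlap(values_a, values_b):
    """True if the two notch intervals share at least one point."""
    low_a, high_a = notch_interval(values_a)
    low_b, high_b = notch_interval(values_b)
    return low_a <= high_b and low_b <= high_a


# Hypothetical scaling exponents, one per event, for two layers.
layer_x = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
layer_y = [3.0, 3.1, 3.2, 3.3, 3.4, 3.5]

medians_differ = not notches_overlap(layer_x, layer_y)
```

With only six events per box the notches are fairly wide, so non-overlapping notches require medians that are well separated relative to the spread across events.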


3.4 Average clustering per layer 


Lastly, for each layer separately, we calculate the average clustering coefficient of the corresponding network. 
This is a measure of the transitivity of the observed interactions, and constitutes an important metric to 
characterize social networks. In particular, for each event and each layer, we compute the average local 
clustering coefficient defined by 

$$C = \frac{1}{N} \sum_{i=1}^{N} c_i, \tag{2}$$

where 

$$c_i = \frac{\sum_{j,k} e_{jk}}{k_i (k_i - 1)},$$

where $e_{jk}$ indicates the edge between users $j$ and $k$, the sum runs over the neighbors $j$ and $k$ of user $i$, and 
$k_i$ is the number of neighbors of $i$. We show in Figure 7 the values of the clustering coefficient 
using three notched box plots, each corresponding to a different layer and including the information about 
the different events. The mention network has the highest clustering level, whereas the reply network has the 
lowest one. The clustering level of the retweet network is the most variable across events; however, the three 
medians are again different because the notches do not overlap. This result is a further confirmation that the 
three layers, and therefore the three types of interaction that they represent, form different network topologies 
and that the dynamical processes producing them are thus distinct. 

Figure 7: Notched box plots showing the value of the average clustering coefficient for each layer. Each box 
aggregates the values corresponding to the different events considered. 
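
A minimal sketch of an average local clustering coefficient for one layer is given below. Several choices are assumptions made for the example rather than details taken from the text: edge direction is ignored when defining neighbors, self-loops are discarded, users with fewer than two neighbors contribute zero, and linked neighbor pairs are counted as ordered pairs so that the denominator is k(k - 1).

```python
from collections import defaultdict
from itertools import permutations


def average_clustering(edges):
    """Mean local clustering over all users appearing in `edges`."""
    neighbors = defaultdict(set)
    for source, target in edges:
        if source != target:  # assumption: ignore self-loops
            neighbors[source].add(target)
            neighbors[target].add(source)
    if not neighbors:
        return 0.0
    total = 0.0
    for user, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue  # assumption: such users contribute c_i = 0
        # Ordered pairs (j, m) of neighbors of `user` that are linked.
        linked = sum(1 for j, m in permutations(nbrs, 2) if m in neighbors[j])
        total += linked / (k * (k - 1))
    return total / len(neighbors)


# Toy layer: a triangle (a, b, c) plus a pendant user d attached to a.
toy_edges = {("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")}
clustering = average_clustering(toy_edges)  # (1/3 + 1 + 1 + 0) / 4
```

The pairwise loop is quadratic in the number of neighbors, which is acceptable for a toy example but would be slow for the hubs of a heavy-tailed network; a dedicated graph library would be preferable at the scale of the datasets considered here.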


4 Discussion 


In this paper we analyze six datasets consisting of Twitter conversations surrounding distinct exceptional 
events. The considered events span very different topics: entertainment, science, commemorations, sports, 
activism, and politics. Our results show that, despite the different fluctuations in time and in volume, there are 
some statistical regularities across the different events. In particular, we find that the in-degree distribution 
of users and the clustering coefficient in each of the three layers (representing interactions based on retweets, 
replies, and mentions, respectively) are consistent across the six different events. Our first conclusion is therefore 
that users’ behavior on Twitter - during exceptional events - presents some universal patterns. 

Secondly, we show that different types of interactions between users on Twitter (retweeting, replying and 
mentioning) generate networks presenting different topological characteristics. These differences were captured 
making use of the multilayer network framework: instead of discarding the information contained in the tweets 
regarding how users interact, we use this information to build a more complete representation of the system by 
means of three layers, each representing a different type of interaction. The fact that networks corresponding 
to different layers present different statistical properties is an important hint for models aiming at reproducing 
human behavior in online social networks. Our results indicate that, to faithfully represent how users interact, 
these models cannot be based on an aggregated view of the network and should account for all the different 
processes taking place in the system, separately. 